Abstract

A unique matching is a stated objective of most computational the- 
ories of stereo vision. This report describes situations where humans 
perceive a small number of surfaces carried by non-unique matching of 
random dot patterns, although a unique solution exists and is observed 
unambiguously in the perception of isolated features. We find both cases 
where non-unique matchings compete and suppress each other and cases 
where they are all perceived as transparent surfaces. The circumstances 
under which each behavior occurs are discussed and a possible explana- 
tion is sketched. It appears that matching reduces many false targets 
to a few, but may still yield multiple solutions in some cases through a 
(possibly different) process of surface interpolation. 
Seeing "Ghost" Solutions in Stereo Vision

Biological stereo vision computes the depth of an object from the disparity 
in position of points matched between the left and right eye images of the 
object. Matching is a difficult computation, which humans apparently do well. 
Figure 1a illustrates this difficulty. There are four possible matches of the two
points in the right image and the two in the left image. Humans see only the two
matches (dark circles in figure 1a) that are order-preserving, namely, matching
left to left and right to right (see [1] and [2]). Furthermore, stereograms with
many random dots in each image (e.g. figure 3a) increase the ambiguity of 
the matching task: each point has many false targets in the other image. A 
matching algorithm is usually required to resolve such ambiguities and obtain a 
unique correspondence for each feature or patch (e.g. [3] and [4]). This report 
describes situations where humans perceive a small number of surfaces carried 
by non-unique matching of random dot patterns, although a unique solution 
exists and is observed unambiguously in the perception of isolated features. It 
appears that matching reduces many false targets to a few, but may still yield 
multiple solutions. 
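
The counting argument above can be illustrated with a minimal sketch. It enumerates the one-to-one matchings of two points per image and labels each as order-preserving or "ghost". The function names, the point positions, and the sign convention for disparity (right position minus left position) are assumptions made for illustration only; they are not taken from the experiments.

```python
from itertools import permutations

def one_to_one_matchings(left_xs, right_xs):
    """Enumerate every one-to-one matching between left and right points.

    Each matching is a list of (left_x, right_x, disparity) triples, with
    disparity taken as right_x - left_x (an assumed sign convention).
    """
    matchings = []
    for perm in permutations(right_xs):
        matchings.append([(l, r, r - l) for l, r in zip(left_xs, perm)])
    return matchings

def is_order_preserving(matching):
    """True if sorting by left position also sorts the right positions."""
    rights = [r for _, r, _ in sorted(matching)]
    return rights == sorted(rights)

left_xs = [10, 14]   # hypothetical x-positions in the left image
right_xs = [11, 13]  # hypothetical x-positions in the right image
for m in one_to_one_matchings(left_xs, right_xs):
    label = "order-preserving" if is_order_preserving(m) else "ghost"
    print(label, [d for _, _, d in m])
```

The two matchings together use all four possible point-to-point matches; under a uniqueness constraint they are mutually exclusive.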

Braddick (1978, unpublished results) has extended Panum's limiting case,
where one eye sees one vertical line and the other sees two (figure 1b), by
copying a random pattern once in one eye and twice in the other with a horizontal
gap of a few pixels. In this case humans perceive two planes, the upper one
transparent, and this depth perception is more robust than the single line lim- 
iting case. Grimson describes this experiment in his book ([5]), and argues that 
this result is consistent with a unique matching even though multiple matching 
takes place: if matching is done simultaneously from each image to the other, 
the matching from the double image to the other one is indeed unique. Note, 
also, that the perceived depth in the extended Panum's limiting case is con- 
sistent with the single feature perception. We have examined this case further 
and observed the following: first, the disparities of the two planes in the dense 
stereogram can be constant (two planes) or vary in any continuous way, like 
cos or sin (see figure 2a). Also, the same pattern can be copied more than 
once (e.g., three copies in three different disparities), in which case more than 
two planes are perceived (e.g. three, see figure 2b), though it becomes more 
difficult to make sense out of the stereogram. The effect is visible even for a 
very low density of points (0.001). 

A similar extension of the double nail illusion, where the two eyes see two
vertical lines with possibly different horizontal spacing (figure 1a), can be made
to random dot stereograms. Thus the same random pattern in a middle square
is copied twice in each image, with a horizontal gap of G_r pixels in the right
image and G_l in the left image, see figure 3a and figure 4a (G_r = 0 or G_l = 0
gives the previous case). Each pair of points in this configuration has four
possible matchings, two mutually exclusive pairs if matching is unique (figure 3b).
For single features, like lines, only the two matchings marked in figure 3b
with full circles are seen, as has been reported before (double nail illusion, see
[6]). Surprisingly, in the extended case, observers with good stereo vision see 
four planes (with the help of vergence and memory). They are able to judge 
the depth of the "ghost" planes correctly, choosing the correct depth from a 
multiple choice scale. This perception, though, takes time to build and some 
concentration. Some people with reasonable stereo vision see only three planes. 
Others (including the author) do not see the "ghost" planes, but most seem 
to improve with practice. On the other hand, Prazdny's stereo algorithm ([7]),
for example, an algorithm designed to handle transparent surfaces, will detect
only two transparent surfaces in this case, those marked by full circles in fig- 
ure 3b. If features, like lines, are added to the stereogram, usually only two 
of them are seen in the two middle planes as expected from the double nail 
illusion experiments. (Note that Kroll and van de Grind [6] also found that one 
observer occasionally saw a third match in the "double-nail" experiment.) This
configuration, like the previous one, is not restricted to fronto-parallel planes
only: one can construct tilted planes, as in figure 4b, and other surfaces. We
have tried a similar configuration for motion, that is, images like those in figure 3a
are seen one after the other instead of in stereo. However, we could not detect
similar effects: only two moving planes are seen.
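
As a rough illustration of the geometry of this configuration, the sketch below lists the disparities of the four copy-to-copy matchings for given gaps. It assumes that in each image the two copies sit at horizontal offsets 0 and the gap, and that disparity is the right offset minus the left offset; the function name and these conventions are hypothetical and serve only to make the counting explicit.

```python
def candidate_disparities(gap_left, gap_right):
    """Disparities (right offset minus left offset) of the copy-to-copy matchings.

    Assumes the copies sit at offsets 0 and the gap in each image.
    Returns (ordered, ghost): the two mutually exclusive pairs of matchings.
    """
    ordered = (0 - 0, gap_right - gap_left)  # first-to-first, second-to-second
    ghost = (gap_right - 0, 0 - gap_left)    # first-to-second, second-to-first
    return ordered, ghost

for gap_left, gap_right in [(4, 2), (2, 2), (0, 2)]:
    ordered, ghost = candidate_disparities(gap_left, gap_right)
    distinct = sorted(set(ordered + ghost))
    print(f"G_l={gap_left} G_r={gap_right}: "
          f"ordered={ordered} ghost={ghost} distinct={distinct}")
```

Under these assumptions, unequal gaps of 4 and 2 pixels give four distinct disparities, with the ordered pair occupying the two middle values; equal gaps give three; and a zero gap in one image gives two, as in the extended Panum's limiting case.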

This result seems to suggest that all disparities with sufficient support give 
rise to the perception of a distinct transparent surface. For convenience, we 
define the support of a given disparity at a given pixel to be the value of the 
correlation function between the two images. The correlation for disparity 
d is computed between a window around the pixel in one image (W in the 
caption of figure 6) and a window around the same location in the other image 
translated by d. A sufficient support may be a correlation value sufficiently 
above random, and the corresponding disparity will be called henceforth a 
"solution". To check the above hypothesis we use a special case of the same 
stereogram where G_r = G_l, see figure 5. In this case there are three possible
solutions (disparities with obvious peaks in the support function), and thus
three transparent planes might be expected, in analogy to the previous case. There
is one "strong" solution, in which all the points in one image are matched to all the
points in the other image, and two "weak" solutions, in which half the points
in one image are matched to half the points in the other image (see figure 6b).
Surprisingly, in this case only the "strong" coherent solution is seen, and this 
perception is quite robust. 
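
The support function can be sketched for binary dot images, where the correlation at disparity d reduces to counting the dots in one image that have a dot d pixels away in the other. This is a simplified illustration, not the computation used for figure 6: the window is taken to be the whole image, and the image size, dot count, seed and function names are all assumptions.

```python
import random

def make_stereogram(n_dots, size, gap_left, gap_right, seed=0):
    """Copy one random dot pattern twice into each image, offset by a gap."""
    rng = random.Random(seed)
    pattern = {(rng.randrange(size), rng.randrange(size)) for _ in range(n_dots)}
    left = pattern | {(x + gap_left, y) for x, y in pattern}
    right = pattern | {(x + gap_right, y) for x, y in pattern}
    return left, right

def support(left, right, disparity):
    """Count left dots that have a right dot at the same row, shifted by disparity."""
    return sum((x + disparity, y) in right for x, y in left)

def peaks(gap_left, gap_right, disparities=range(-6, 7)):
    """Support at each disparity, using the whole image as the window (assumption)."""
    left, right = make_stereogram(200, 400, gap_left, gap_right)
    return {d: support(left, right, d) for d in disparities}

for gaps in [(4, 2), (2, 2)]:
    print(gaps, peaks(*gaps))
```

With unequal gaps the sketch gives four peaks of roughly equal height; with equal gaps it gives one peak supported by every dot and two peaks of about half that height, which are the "strong" and "weak" solutions described above.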

It now seems that only solutions with approximately equal support, i.e. 
comparable maxima in the correlation function, are detected, whereas weaker 
solutions are suppressed. To check this hypothesis we use the initial stereogram
where G_r ≠ G_l, but we double the number of points in one plane, so that its
peak in the correlation function will be twice as high as the others (figure 6c).
In this case all four solutions are still seen, but the "strong" dense solution 
is seen darker. Even if the number of points in one surface is quadrupled, 
the other surfaces are visible. Figure 7a demonstrates a worse case from our
point of view, where we add new unambiguously matched points to the two
external surfaces, which together constitute a complete solution of the matching 
problem. Consequently the support of these two solutions is doubled. Still, 
one can see from the figure that the other two solutions are readily seen. We 
conclude that it is not simply the relative strength (value of support) of the 
solutions that determines which of them will be perceived. If to a stereogram
with G_l = G_r, where initially only one plane has been seen, we add an equal
number of unmatched random dots to each image, the suppressed solutions
may sometimes (at least partially) reappear (see figure 7b). If we add correlated
points to both images, forming a new plane, it and the dominant plane are seen 
(see figure 7c). Some of the suppressed solutions may then reappear. These 
effects have not been studied thoroughly enough to state conclusive results, 
though. 

One way by which surface interpolation could lead to the perception of mul- 
tiple transparent surfaces is if one or the other unique matching is chosen locally 
and randomly, and interpolation then smoothes across the "holes". But depth
determination for isolated features is not random: the ordered matching is
consistently seen. Moreover, it seems that random assignment cannot explain the
qualitative difference between the case with four possible solutions, when four 
planes can be seen (figure 3), and the case with three possible solutions, when 
only one opaque plane is always seen (figure 5). Finally, random assignment 
does not explain the perception of a single surface in periodic stereograms, 
which are characterized by repetitive patterns, as in wallpaper designs. Multiple
possible matchings of the images exist here, each involving an almost
complete matching of all points in the two images and creating different depth
perceptions (see [8], p. 187, for a summary). The different solutions
are seen, but not simultaneously: one has to "flip" from one to the other using
eye vergence and cues from the surroundings (see figure 6.2-2* in [8]). Thus
it seems possible that locally multiple matchings are detected, maintained and 
manipulated through the process of interpolation, even though eventually one 
unambiguous match is chosen for each local feature. 

A possible surface interpolation procedure that agrees with the above
results will now be briefly discussed, though this is by no means the only possible
explanation. Initially surfaces are constructed that take into account all
possible matches of all the pixels. We define the support set of a surface to be the
set of points, say within a window of size W, that contribute to its construc- 
tion. We assume W is large enough so that random matches corresponding 
to all disparities are abundant. A surface is maintained if it has a sufficiently 
large support set that is not included in the support set of a different surface. 
With simple transparencies (see [7] and [9]) all possible solutions will survive, 
as their support sets are disjoint. In figure 3 all four solutions will survive due 
to random matches, whereas in figure 5 the "strong" solution will dominate. 
Finally, all the solutions in a periodic stereogram have overlapping support sets 
(of all the points in the window) almost everywhere. In agreement with the
observations, the above scheme predicts that each solution can suppress the
other, depending upon additional cues or vergence (see [10]). 
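
The survival rule above can be written down as a small sketch. It takes, for each candidate disparity, a support set of dots within one window, and keeps a surface when its support set is large enough and is not contained in the support set of another surface. The toy support sets below (integers standing for dots), the size threshold and the function name are hypothetical; they only mimic the qualitative cases discussed, and are not derived from the actual stereograms.

```python
def surviving_surfaces(support_sets, min_size):
    """Keep a surface if its support set is large enough and is not
    contained in the support set of any other candidate surface."""
    survivors = []
    for d, points in support_sets.items():
        if len(points) < min_size:
            continue
        contained = any(points <= other
                        for e, other in support_sets.items() if e != d)
        if not contained:
            survivors.append(d)
    return sorted(survivors)

# Toy data: integers stand for dots inside one window (hypothetical).
copy_a = set(range(0, 50))     # dots of the first copy of the pattern
copy_b = set(range(50, 100))   # dots of the second copy of the pattern

# Equal gaps (as in figure 5): the strong solution is supported by every dot.
equal_gaps = {0: copy_a | copy_b, 2: copy_a, -2: copy_b}

# Unequal gaps (as in figure 3): each solution also picks up a few
# random matches among the dots of the other copy.
unequal_gaps = {
    0: copy_a | {50, 51},
    2: copy_a | {52, 53},
    -2: copy_b | {0, 1},
    -4: copy_b | {2, 3},
}

# Simple transparency: disjoint support sets.
transparent = {1: copy_a, -1: copy_b}

# Periodic pattern: every solution is supported by the same dots.
periodic = {0: copy_a | copy_b, 8: copy_a | copy_b}

print(surviving_surfaces(equal_gaps, min_size=10))    # [0]
print(surviving_surfaces(unequal_gaps, min_size=10))  # [-4, -2, 0, 2]
print(surviving_surfaces(transparent, min_size=10))   # [-1, 1]
print(surviving_surfaces(periodic, min_size=10))      # []
```

In the periodic case the sketch keeps no surface, because each support set is contained in the other; this corresponds to solutions that can suppress one another, so that which one is seen would have to be decided by additional cues or vergence rather than by the rule itself.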

The main implication of the above results is the conclusion that the reso- 
lution of ambiguities is not as simple as practically all computational theories 
of stereo vision assume (e.g. [4]). A unique solution is a stated objective of 
these theories. The above results will require their extension to describe sur- 
face interpolation which extends multiple possible matchings. Possibly, feature 
matching and disambiguation and surface interpolation are different processes 
(see a similar suggestion by Mitchison in [11]).

Acknowledgements: I thank Scott Kirkpatrick for his invaluable help in
discussion, advice and criticism. For very helpful comments and advice I thank Graeme
Mitchison, Tomaso Poggio and Heinrich Bülthoff.
Figure 1: a) Ambiguous matching or the double nail illusion: there is one
"natural" order-preserving matching of L_1 to R_1 and L_2 to R_2. This solution is
marked in full circles, and is always perceived, even when the correct matching
is L_1 to R_2 and L_2 to R_1 (the "ghost" solution). This effect is demonstrated
in an experiment where two nails are put one behind the other with respect to
the viewer. A depth illusion is then created where the two nails are seen one
beside the other at the same depth ([6]). b) Panum's limiting case: here one eye
sees two lines and the other sees only one. Most people perceive the two lines 
differing in depth, as if they match the single line in one image simultaneously 
with the two lines in the other (if the distance between them is sufficiently 
small).

Figure 2: Variations on the extended Panum's limiting case: a) cos and its 
mirror image, b) three planes. 
Figure 3: a) An ambiguous stereogram with G_r = 2 pixels and G_l = 4 pixels.
After some staring one can see four planes: two in front of the background,
the background (all three transparent), and one behind the background. The
deepest plane is usually the most difficult to see, and it often helps to slightly
diverge the eyes to capture it. Note the two lines ("nails") in the middle of
the stereogram, which are usually seen on the two middle surfaces only, and
which are hard to flip to the other surfaces. Because it takes some time for
the impression to build, it is recommended to use a stereo viewer about 5.5
inches high. b) A graphic illustration of the projections of two points from the
stereogram in a. The two solutions that are mutually exclusive if matching is
unique are separately marked by filled and hollow circles. c) The depth profile
of the four possible solutions of the stereogram in a.













Figure 4: Variations on figure 3a: a) smaller disparities, of only 5-6 minutes of
arc, separate two nearby planes, so that it is easier to see the four
planes together but more difficult to distinguish them; b) two fronto-parallel
planes and two tilted planes that create an "X" shape between them in depth 
can be seen. 
Figure 5: a) An ambiguous stereogram with G_r = G_l = 2 pixels. Here only
one opaque plane is seen, the background, and no vergence can help detect the
other two planes (one above the background and one below it). b) A graphic
illustration of the projections of two points from the stereogram in a. c) The
depth profile of the three possible solutions of the stereogram in a.
Figure 6: The correlation between the left and the right images at their center:
$\sum_{i}^{W} \sum_{j}^{W} f(x_i^l, y_j^l)\, f(x_i^r + D, y_j^r)$. The X-axis is
the disparity D: a) stereogram of figure 3, b) stereogram of figure 5, c) stereogram
of figure 3 with additional points to one solution.
Figure 7: a) like figure 3a, where the number of points in the two external planes
has been doubled with new unambiguous points; b) like figure 5a, where the
number of points in each image is doubled with new random unmatched points 
(noise); c) like figure 5a, with an additional new uncorrelated plane at disparity 
4, double the disparity of one of the suppressed planes (2 and -2). 